
Agent Genie++: Add README and deployment images#651

Open
nitinaggarwal-databricks wants to merge 9 commits into databrickslabs:main from nitinaggarwal-databricks:agent-genie++

Conversation

@nitinaggarwal-databricks
Contributor

Summary

Adds documentation and images for the Agent Genie++ app to the agent_genie directory.

Changes

  • README.md – Documentation covering:
    • Overview and key features
    • Architecture and analytics capabilities
    • Deployment instructions (Marketplace and manual)
    • Troubleshooting (OAuth scopes, permissions)
    • User permissions and monitoring
  • images/ – Screenshots for the deployment workflow and app features

Testing

  • Verified README renders correctly
  • Image paths reference ./images/media/ correctly


Copilot AI left a comment


Pull request overview

This PR introduces the Agent Genie++ Databricks App assets under agent_genie/ (backend, UI, deployment manifests, and documentation), and also removes the legacy dbsql/serverless_migration_dash documentation.

Changes:

  • Add Agent Genie++ FastAPI backend and web UI (templates) to support Genie + Serving Endpoint workflows, plus PDF upload/Q&A.
  • Add deployment/config artifacts (databricks.yml, app.yaml, manifest.yaml) and Python dependencies (requirements.txt).
  • Add extensive agent_genie/README.md with screenshots; remove dbsql/serverless_migration_dash/README.md (directory appears removed).

Reviewed changes

Copilot reviewed 18 out of 34 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
manifest.mf Adds a root manifest file.
dbsql/serverless_migration_dash/README.md Removes the Serverless Migration Dash README (and likely the directory).
agent_genie/README.md Adds end-user/deployment documentation and screenshots.
agent_genie/app.py Adds the FastAPI app implementing Genie queries, AI-function routing, and PDF session handling.
agent_genie/templates/index.html Adds the UI (chat, table rendering, PDF upload, feedback capture).
agent_genie/helper.py Adds Genie polling + OAuth token minting and optional Tavily helper.
agent_genie/table_extraction.py Adds UC table/column introspection helpers via REST calls.
agent_genie/my_prompts.py Adds prompt templates for column selection, query rephrasing, and classification.
agent_genie/manual_ai_content.py Adds reference content used to construct AI-function prompts.
agent_genie/requirements.txt Adds Python dependencies for the app.
agent_genie/databricks.yml Adds Databricks bundle configuration for app + resources + installer job.
agent_genie/app.yaml Adds app runtime command/env bindings for Apps deployment.
agent_genie/manifest.yaml Adds app resource specs and user API scopes.
Comments suppressed due to low confidence (6)

agent_genie/templates/index.html:2574

  • uploadPdf() throws new Error(data.detail || 'Upload failed'), but the /upload_pdf backend returns errors under the error key (and only uses detail for FastAPI-raised HTTPExceptions). This drops useful error messages for most failure cases (encrypted/invalid PDFs, etc.). Please align the frontend to read data.error as well, or standardize backend error responses to always use detail.
                try {
                    const res = await fetch('/upload_pdf', { method: 'POST', body: form });
                    const data = await res.json();
                    if (!res.ok) throw new Error(data.detail || 'Upload failed');
                    
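As an illustration of the interim workaround the comment describes (sketched in Python rather than the page's JavaScript; `normalize_error` is a hypothetical helper, not part of the app), a caller can accept either error shape until the backend is standardized:

```python
def normalize_error(payload: dict, default: str = "Upload failed") -> str:
    """Pick a human-readable message from either error shape.

    FastAPI's HTTPException reports failures under "detail", while the
    /upload_pdf handler described above reports them under "error";
    checking both keys avoids dropping the more specific message.
    """
    return payload.get("detail") or payload.get("error") or default
```

The cleaner long-term fix, as the comment notes, is to standardize the backend on one key so no such bridging is needed.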

agent_genie/table_extraction.py:56

  • Both requests.get(...) calls are made without a timeout. In production this can hang worker threads indefinitely if the Databricks control plane or network stalls. Please set a reasonable timeout (and consider retry/backoff for transient errors) for these UC API calls.
        # Make the request
        response = requests.get(endpoint, headers=headers, params=params)
        
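A minimal sketch of the suggested fix (the `uc_get` helper name, retry count, and retryable status codes are illustrative assumptions, not the app's actual behavior); the HTTP session is injectable so the logic can be exercised without a network:

```python
import time

import requests


def uc_get(endpoint, headers=None, params=None, timeout=10.0, retries=3, session=None):
    """GET with an explicit timeout plus simple exponential backoff.

    `session` defaults to the `requests` module and is injectable for tests.
    Retries on connection errors and on 429/5xx responses; any other
    response is returned to the caller as-is.
    """
    http = session if session is not None else requests
    delay = 1.0
    for attempt in range(retries):
        try:
            resp = http.get(endpoint, headers=headers, params=params, timeout=timeout)
        except requests.exceptions.RequestException:
            if attempt == retries - 1:
                raise
        else:
            if resp.status_code not in (429, 500, 502, 503, 504) or attempt == retries - 1:
                return resp
        time.sleep(delay)
        delay *= 2
```

The same wrapper would cover the second finding below (table_extraction.py:110), since both calls share the missing-timeout issue.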

agent_genie/table_extraction.py:110

  • get_table_columns() uses requests.get(...) without a timeout, so a stalled request can hang the app. Please add a timeout (and ideally handle common HTTP errors/429s with retries/backoff).
        # Make the request
        response = requests.get(endpoint, headers=headers)
        

agent_genie/app.py:5

  • The PR description says this change only adds documentation and images, but the diff introduces a full FastAPI app (app.py/templates), dependency set (requirements.txt), deployment configs (databricks.yml/app.yaml/manifest.yaml), and also removes the dbsql/serverless_migration_dash README/dir. Please update the PR description/title to match the actual scope, or split this into separate PRs (docs/images vs application code vs dbsql cleanup) to make review/rollback safer.
from fastapi import FastAPI, Request, UploadFile, File, HTTPException
from fastapi.responses import JSONResponse
from fastapi.templating import Jinja2Templates
import pandas as pd
import logging

agent_genie/requirements.txt:18

  • openai is declared twice in this requirements file (once earlier and again here), which can confuse dependency management and makes intended constraints unclear. Please deduplicate the entry (and consider adding consistent version constraints).
python-dotenv
openai
backoff

agent_genie/templates/index.html:1352

  • The UI hard-codes an individual email address as the support contact. This is brittle for reuse/redistribution and can create privacy/ownership issues. Consider replacing it with a generic support alias or injecting it via configuration/env so it can be set per deployment.
        <div class="section-description">
            A multi-agent AI engine that answers questions, forecasts trends, classifies data, summarizes records, translates languages, and recommends next-best actions. For any questions or feedback, please reach out to nitin.aggarwal@databricks.com 
        </div>


Comment on lines +74 to +95

# --- Installer job to persist Tavily key into a secret and write optional config ---
jobs:
  install_app:
    name: "${var.project_name} - Install/Configure"
    tasks:
      - task_key: configure_app
        notebook_task:
          notebook_path: "./notebooks/setup_app" # create this notebook
          base_parameters:
            TAVILY_API_KEY: "${var.tavily_api_key}"
            SECRET_SCOPE: "${var.secret_scope}"
            SECRET_KEY: "${var.secret_key}"
        # Add compute for your workspace (example placeholders):
        # existing_cluster_id: "<your-cluster-id>"
        # OR:
        # job_clusters:
        #   - job_cluster_key: "install_cluster"
        #     new_cluster:
        #       spark_version: "14.3.x-scala2.12"
        #       node_type_id: "i3.xlarge"
        #       num_workers: 0

Copilot AI Mar 5, 2026


This installer job task points at ./notebooks/setup_app, but that notebook path is not present in the repo/PR. Bundle deployment will fail when it tries to create/run this job; please add the notebook or remove/disable the installer job block until implemented.

Suggested change
# --- Installer job to persist Tavily key into a secret and write optional config ---
jobs:
  install_app:
    name: "${var.project_name} - Install/Configure"
    tasks:
      - task_key: configure_app
        notebook_task:
          notebook_path: "./notebooks/setup_app" # create this notebook
          base_parameters:
            TAVILY_API_KEY: "${var.tavily_api_key}"
            SECRET_SCOPE: "${var.secret_scope}"
            SECRET_KEY: "${var.secret_key}"
        # Add compute for your workspace (example placeholders):
        # existing_cluster_id: "<your-cluster-id>"
        # OR:
        # job_clusters:
        #   - job_cluster_key: "install_cluster"
        #     new_cluster:
        #       spark_version: "14.3.x-scala2.12"
        #       node_type_id: "i3.xlarge"
        #       num_workers: 0

Copilot uses AI. Check for mistakes.
try {
    // Try to parse and pretty-print JSON
    const parsed = JSON.parse(content);
    formatted = '<pre>' + JSON.stringify(parsed, null, 2) + '</pre>';

Copilot AI Mar 5, 2026


Potential XSS: when JSON parsing succeeds, formatted is replaced with <pre> + JSON.stringify(...) without escaping, and the caller later assigns it via innerHTML. If JSON values contain HTML, it can execute in the page. Please escape the JSON string (or render via textContent on a created <pre> element) before inserting into the DOM.

Suggested change
formatted = '<pre>' + JSON.stringify(parsed, null, 2) + '</pre>';
const jsonString = JSON.stringify(parsed, null, 2);
const escapedJson = escapeHtml(jsonString);
formatted = '<pre>' + escapedJson + '</pre>';

Comment on lines +977 to +979
try:
    tables_result = get_tables(catalog_name, schema_name, fallback_workspace_url, None)
    if tables_result["success"]:

Copilot AI Mar 5, 2026


get_tables(...) is called with access_token=None, which forces table_extraction.py to fall back to ACCESS_TOKEN from the environment. In Databricks Apps/OBO mode you already mint an OAuth token via Config() (fallback_access_token), so schema discovery can fail even when auth is available. Please pass the minted token into this call (or refactor table_extraction to use the SDK).

for table in tables_result["tables"]:
    table_name = table["name"]
    try:
        columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, None)

Copilot AI Mar 5, 2026


get_table_columns(...) is called with access_token=None, which again falls back to ACCESS_TOKEN from env. To make this work reliably in OBO environments, pass fallback_access_token (or a freshly minted token) into this call as well.

Suggested change
columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, None)
columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, fallback_access_token)
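A sketch of the token-threading fix both comments ask for. The `discover_schema` wrapper is hypothetical, and the two helpers are injected as parameters here so the flow can be shown without real REST calls; in the app, `access_token` would be the minted `fallback_access_token` rather than None:

```python
def discover_schema(catalog_name, schema_name, workspace_url, access_token,
                    get_tables, get_table_columns):
    """Walk a schema's tables, reusing one minted OAuth token for every call.

    `get_tables` / `get_table_columns` stand in for the helpers in
    table_extraction.py; the point is that `access_token` is passed through
    instead of None, so no call silently falls back to ACCESS_TOKEN from env.
    """
    schema = {}
    tables_result = get_tables(catalog_name, schema_name, workspace_url, access_token)
    if not tables_result.get("success"):
        return schema
    for table in tables_result["tables"]:
        columns_result = get_table_columns(
            catalog_name, schema_name, table["name"], workspace_url, access_token
        )
        if columns_result.get("success"):
            schema[table["name"]] = columns_result["columns"]
    return schema
```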

Contributor

@pohlposition pohlposition left a comment


LGTM

Collaborator

@alexott alexott left a comment


Why does it remove files in other directories, like dbsql/serverless_migration_dash?

manifest (1).mf Outdated
Collaborator


Why do we need a copy of the file? Which of them is correct?

Contributor Author


I am only working on the 'agent_genie' directory, so I am not sure why it removed dbsql/serverless_migration_dash. The manifest file copy is also a mistake. Let me fix both.

@alexott alexott requested a review from Copilot March 5, 2026 19:08

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@nitinaggarwal-databricks
Contributor Author

Summary of fixes: restored dbsql/serverless_migration_dash, removed duplicate manifest, addressed Copilot (token, XSS, installer job). Ran [brief list of what you tested]. Ready for re-review.

@alexott – could you re-review when you have a moment?

Collaborator


Why is this removed as well?

Collaborator


This file isn't necessary
